
    Unleashing the Power of Hashtags in Tweet Analytics with Distributed Framework on Apache Storm

    Twitter is a popular social network platform where users can interact and post texts of up to 280 characters called tweets. Hashtags, hyperlinked words in tweets, have become increasingly crucial for tweet retrieval and search. Using hashtags for tweet topic classification is a challenging problem because of context dependence among words, slang, abbreviations and emoticons in a short tweet, along with the evolving use of hashtags. Since Twitter generates millions of tweets daily, tweet analytics is a fundamental Big Data stream problem that often requires real-time distributed processing. This paper proposes a distributed online approach to tweet topic classification with hashtags. Implemented on Apache Storm, a distributed real-time framework, our approach incrementally identifies and updates a set of strong predictors in the Naïve Bayes model for classifying each incoming tweet instance. Preliminary experiments show promising results with up to 97% accuracy and a 37% increase in throughput on eight processors. Comment: IEEE International Conference on Big Data 201
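
    The incremental Naïve Bayes idea in the abstract above can be illustrated with a minimal single-machine sketch. It is not the paper's method: it omits both the selection of strong predictors and the Apache Storm distribution, and the class name `OnlineNaiveBayes`, the `tokenize` helper, the topic labels and the sample tweets are all hypothetical. It only shows how per-topic counts can be updated one tweet at a time and used to classify the next incoming tweet, with hashtags treated as ordinary tokens.

```python
import math
from collections import Counter, defaultdict


class OnlineNaiveBayes:
    """Multinomial Naive Bayes whose counts are updated one tweet at a time."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                        # Laplace smoothing constant (assumed)
        self.class_counts = Counter()             # number of tweets seen per topic
        self.token_counts = defaultdict(Counter)  # token counts per topic
        self.vocab = set()                        # all tokens seen so far

    def update(self, tokens, topic):
        """Fold one labeled tweet into the running counts."""
        self.class_counts[topic] += 1
        self.token_counts[topic].update(tokens)
        self.vocab.update(tokens)

    def predict(self, tokens):
        """Return the topic with the highest smoothed log-posterior."""
        total = sum(self.class_counts.values())
        best_topic, best_score = None, -math.inf
        for topic, n in self.class_counts.items():
            counts = self.token_counts[topic]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n / total)
            for tok in tokens:
                score += math.log((counts[tok] + self.alpha) / denom)
            if score > best_score:
                best_topic, best_score = topic, score
        return best_topic


def tokenize(tweet):
    """Hypothetical tokenizer: lowercase and split on whitespace."""
    return tweet.lower().split()


model = OnlineNaiveBayes()
model.update(tokenize("Great goal tonight #football"), "sports")
model.update(tokenize("New phone launch #tech"), "technology")
prediction = model.predict(tokenize("what a match #football"))
print(prediction)  # sports
```

    Because `update` only increments counters, each tweet is processed in time proportional to its length, which is what makes this kind of model a natural fit for a stream setting.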

    Efficiency Mechanisms For A Class Of Blackboard Systems

    This paper presents efficient mechanisms for activation, execution and rating that are suitable for use in BB1-style blackboard architectures. We describe a knowledge source compiler that produces match networks and demons for efficient activation and rating while compiling the entire system for increased execution speed. Experiments using the enhancements in a general-purpose blackboard shell show that run-time speed approximately doubles, including an increase in activation speed by a factor of 7.6 on average. We have also resolved a subclass of blackboard systems that can be compiled down to the machine level by using a condensed representation where low-level blackboard accesses are replaced by vector references. Our analysis shows that the execution cycle of a condensed system is faster than the conventional approach by the ratio of the time required for blackboard retrievals to the time required for vector element retrievals. In practice, this ra..

    Analytics on Anonymity for Privacy Retention in Smart Health Data

    Advancements in smart technology, wearable and mobile devices, and the Internet of Things have made smart health an integral part of modern living to better individual healthcare and well-being. By enhancing self-monitoring, data collection and sharing among users and service providers, smart health can promote healthy lifestyles, enable timely treatments, and save lives. However, as health data become larger and more accessible to multiple parties, they become vulnerable to privacy attacks. One way to safeguard privacy is to increase users’ anonymity, since anonymity increases indistinguishability and makes re-identification harder. Still, the challenge is not only to preserve data privacy but also to ensure that the shared data are sufficiently informative to be useful. Our research studies health data analytics focusing on anonymity for privacy protection. This paper presents a multi-faceted analytical approach to (1) identifying attributes susceptible to information leakages by using an entropy-based measure to analyze information loss, (2) anonymizing the data by generalization using attribute hierarchies, and (3) balancing between anonymity and informativeness by our anonymization technique that produces anonymized data satisfying a given anonymity requirement while optimizing data retention. Our anonymization technique is an automated Artificial Intelligence search based on two simple heuristics. The paper describes and illustrates the detailed approach and analytics, including pre- and post-anonymization analytics. Experiments on published data are performed on the anonymization technique. Results, compared with other similar techniques, show that our anonymization technique gives the most effective data sharing solution with respect to computational cost and the balance between anonymity and data retention.
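
    The interplay of generalization hierarchies, an anonymity requirement and an entropy measure described above can be illustrated with a small sketch. It is not the paper's heuristic search: it handles a single attribute, scans the hierarchy levels linearly, and uses a k-anonymity style group-size check as the anonymity requirement, which is an assumption. The age hierarchy, the function names and the sample data are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def generalize_age(age, level):
    """Hypothetical hierarchy: level 0 = exact age, 1 = decade, 2 = suppressed."""
    if level == 0:
        return str(age)
    if level == 1:
        low = (age // 10) * 10
        return f"{low}-{low + 9}"
    return "*"


def anonymize(ages, k, max_level=2):
    """Return the lowest level at which every group holds at least k records."""
    for level in range(max_level + 1):
        generalized = [generalize_age(a, level) for a in ages]
        if min(Counter(generalized).values()) >= k:
            return level, generalized
    raise ValueError("k exceeds the number of records")


ages = [23, 27, 31, 35, 38, 24]
level, released = anonymize(ages, k=3)

# Entropy before and after shows how much distinguishing information was given up.
before = entropy([str(a) for a in ages])
after = entropy(released)
print(level, released, round(before, 3), round(after, 3))
```

    Choosing the lowest level that satisfies the requirement is one simple way to retain as much data as possible; a search over several attributes at once would need a more careful strategy than this linear scan.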

    Intelligent Monitoring and Control

    Intelligent monitoring and control involves observing and guiding the behavior of a physical system toward some objective, with real-time constraints on the utility of particular actions. Generic functional requirements for this task include: integration of perception, reasoning, and action; integration of multiple reasoning activities; reasoning about complex, time-varying systems; coordination of multiple response modes; dynamic allocation of limited computational resources. We illustrate these requirements in the domain of patient monitoring in a surgical intensive care unit (SICU). We propose a generic architecture, designed and implemented in layers: top-level system organization; reasoning architecture; generic reasoning skills and knowledge representation; first-principles knowledge of physical systems; domain knowledge. We illustrate the architecture in the "Guardian" system for SICU monitoring and describe Guardian's performance on an illustrative scenario. Finally, we discuss the generality and limitations of the proposed architecture. 1 The Problem. Intelligent monitoring and control involves observing and guiding the behavior of a physical system toward some objective, with real-time constraints on the utility of particular actions. Control theory ([Bollinger and Duffie, 1988], [Hale, 1973]) is useful for tasks that permit a straightforward mapping between sensed data values and appropriate control actions. By contrast, we are con- *This research was supported by grants from DARPA and NIH and by gifts from Rockwell International Corp. and FMC Corp. The Palo Alto VAMC supports Adam Seiver's participation in the project. We gratefully acknowledge contributions by Reed Hastings and Nicholas Parlante to an early version of Guardian and to Adnan Darwiche, who recently joined the project. We have benefitted from discussions with Lawrence Fagan and his students, who are working on a related ICU monitoring system, called 'QQ.' We thank Edward Feigenbaum for sponsoring the work at the Knowledge System

    Architectural Foundations for Real-Time Performance in Intelligent Agents

    Intelligent agents perform multiple concurrent tasks requiring both knowledge-based reasoning and interaction with dynamic entities in the environment, under real-time constraints. Because an agent's opportunities to perceive, reason about, and act upon the environment typically exceed its computational resources, it must determine which operations to perform and when to perform them so as to achieve its most important objectives in a timely manner. Accordingly, we view the problem of real-time performance as a problem in intelligent real-time control. We propose and define several important control requirements and present an agent architecture that is designed to address those requirements. The proposed architecture is a blackboard architecture, whose key features include: distribution of perception, action, and cognition among parallel processes, limited-capacity I/O buffers with best-first retrieval and worst-first overflow, dynamic control planning, dynamic focus of attention, and..
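
    One of the listed features, a limited-capacity buffer with best-first retrieval and worst-first overflow, can be sketched as follows. This is a minimal illustration under assumptions, not the architecture's actual implementation: the class name `BoundedPriorityBuffer`, the numeric priorities and the sample items are hypothetical, and a plain list is scanned on each operation for clarity rather than speed.

```python
class BoundedPriorityBuffer:
    """Fixed-capacity buffer: retrieve the best item, drop the worst on overflow."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.items = []  # list of (priority, item); higher priority is better

    def __len__(self):
        return len(self.items)

    def put(self, priority, item):
        """Insert an entry; on overflow, remove and return the lowest-priority entry."""
        self.items.append((priority, item))
        if len(self.items) <= self.capacity:
            return None
        worst = min(range(len(self.items)), key=lambda i: self.items[i][0])
        return self.items.pop(worst)

    def get(self):
        """Remove and return the highest-priority entry."""
        if not self.items:
            raise IndexError("get from an empty buffer")
        best = max(range(len(self.items)), key=lambda i: self.items[i][0])
        return self.items.pop(best)


buf = BoundedPriorityBuffer(capacity=2)
buf.put(5, "heart rate alarm")
buf.put(1, "routine reading")
dropped = buf.put(3, "blood pressure drift")  # buffer is full, so the worst entry is dropped
best = buf.get()
print(dropped, best)
```

    Dropping the worst entry rather than the oldest means that, when inputs arrive faster than they can be processed, the items judged most important are the ones that survive.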